Revert "[SPARK-56975][SS] Reject user-specified schema in DataStreamReader.table()"#56189
Closed
PorridgeSwim wants to merge 2 commits into
anishshri-db
approved these changes
May 29, 2026
…eader.table()" This reverts commit 05b4d81.
e4042ab to
6170a60
anishshri-db
pushed a commit
that referenced
this pull request
May 30, 2026
…eader.table()"

### What changes were proposed in this pull request?

This reverts commit `05b4d81f3f938ff140886d6f66ad66d08c66d5b2` (SPARK-56975), which made `DataStreamReader.table()` reject a user-specified schema by calling `assertNoSpecifiedSchema("table")`. This restores the previous behavior, where a user-specified schema passed before `.table()` is accepted (and ignored).

### Why are the changes needed?

SPARK-56975 is a behavior-breaking change. Code that previously ran successfully (e.g. `spark.readStream.schema(s).table(name)`) now throws an `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). While a schema has no effect on `.table()`, rejecting it outright breaks existing user workloads that set a schema on the `DataStreamReader` before calling `.table()`.

A user-facing behavior change like this must go through the project's breaking-change process, which was not followed for SPARK-56975. We are reverting it to restore backward compatibility; a proper deprecation path can be pursued separately if the stricter behavior is still desired.

### Does this PR introduce _any_ user-facing change?

Yes. It restores the pre-SPARK-56975 behavior: `DataStreamReader.table()` again accepts (and silently ignores) a user-specified schema instead of throwing `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). Since SPARK-56975 only landed in unreleased branches (`master` and `branch-4.2`), there is no change relative to any released Spark version.

### How was this patch tested?

This is a straight `git revert`. Existing `DataStreamTableAPISuite` tests pass; the test added by SPARK-56975 (`"read: user-specified schema is not allowed with table API"`) is removed as part of the revert.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #56189 from PorridgeSwim/revert-SPARK-56975.

Lead-authored-by: You Zhou <you.zhou@databricks.com>
Co-authored-by: You Zhou <98635051+PorridgeSwim@users.noreply.github.com>
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
(cherry picked from commit 6039af8)
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
### What changes were proposed in this pull request?

This reverts commit `05b4d81f3f938ff140886d6f66ad66d08c66d5b2` (SPARK-56975), which made `DataStreamReader.table()` reject a user-specified schema by calling `assertNoSpecifiedSchema("table")`. This restores the previous behavior, where a user-specified schema passed before `.table()` is accepted (and ignored).

### Why are the changes needed?

SPARK-56975 is a behavior-breaking change. Code that previously ran successfully (e.g. `spark.readStream.schema(s).table(name)`) now throws an `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). While a schema has no effect on `.table()`, rejecting it outright breaks existing user workloads that set a schema on the `DataStreamReader` before calling `.table()`.

A user-facing behavior change like this must go through the project's breaking-change process, which was not followed for SPARK-56975. We are reverting it to restore backward compatibility; a proper deprecation path can be pursued separately if the stricter behavior is still desired.

### Does this PR introduce _any_ user-facing change?

Yes. It restores the pre-SPARK-56975 behavior: `DataStreamReader.table()` again accepts (and silently ignores) a user-specified schema instead of throwing `AnalysisException` (`_LEGACY_ERROR_TEMP_1189`). Since SPARK-56975 only landed in unreleased branches (`master` and `branch-4.2`), there is no change relative to any released Spark version.

### How was this patch tested?

This is a straight `git revert`. Existing `DataStreamTableAPISuite` tests pass; the test added by SPARK-56975 (`"read: user-specified schema is not allowed with table API"`) is removed as part of the revert.

### Was this patch authored or co-authored using generative AI tooling?

No.
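
### Illustration (toy model, not Spark code)

To make the two behaviors described above concrete, here is a minimal, self-contained Python sketch. It is not Spark's implementation: `ToyStreamReader`, `AnalysisError`, the `reject_schema_on_table` flag, the returned dictionary, and the error message are all hypothetical stand-ins, used only to contrast "accept and ignore" (restored by this revert) with "reject" (the SPARK-56975 behavior).

```python
class AnalysisError(Exception):
    """Hypothetical stand-in for an analysis-time error; not Spark's AnalysisException."""


class ToyStreamReader:
    """Toy model of a streaming reader; NOT Spark's DataStreamReader."""

    def __init__(self, reject_schema_on_table=False):
        # reject_schema_on_table=True models the stricter SPARK-56975 behavior;
        # False models the behavior this revert restores.
        self._user_schema = None
        self._reject_schema_on_table = reject_schema_on_table

    def schema(self, schema):
        # Remember the user-specified schema and allow call chaining.
        self._user_schema = schema
        return self

    def table(self, name):
        if self._reject_schema_on_table and self._user_schema is not None:
            # Stricter behavior: a schema set before table() is an error.
            raise AnalysisError("user-specified schema is not allowed with table()")
        # Restored behavior: the user-specified schema is silently ignored,
        # and the schema is taken from the table itself.
        return {"table": name, "schema_source": "table"}


# Restored behavior: the call chain succeeds and the schema is ignored.
lenient = ToyStreamReader().schema("id INT").table("events")

# Stricter behavior: the same call chain raises.
try:
    ToyStreamReader(reject_schema_on_table=True).schema("id INT").table("events")
    strict_raised = False
except AnalysisError:
    strict_raised = True
```

In this sketch the same call chain, `.schema(...).table(...)`, succeeds under the lenient setting and fails under the strict one, which is why the stricter check breaks workloads that already set a schema before calling `.table()`.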